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Abstract 

A method is presented to tackle the sign problem in the simulations of systems 
having indefinite or complex- valued measures. In general, this new approach is shown 
to yield statistical errors smaller than the crude Monte Carlo using absolute values of 
the original measures. Exactly solvable, one- dimensional Ising models with complex 
temperature and complex activity illustrate the considerable improvements and the 
workability of the new method even when the crude one fails. 



Introduction 



Numerical simulations have opened up new directions in the study of many interesting 
problems, offering an alternative and complementary method when the problems are ana- 
lytically managable and the only systematic method in more complicated problems. How- 
ever, in the case of general dynamical systems even numerical methods have difficulty due 
to the fundamental 'sign problem' when the measures of generating functions are not posi- 
tive definite or are complex [f], invalidating the probabilistic interpretation of conventional 
Monte Carlo (MC) simulations. Many interesting and important physical systems, unfor- 
tunately, belong to this class with the sign problem. Examples include: real-time path 
integrals of quantum mechanics and quantum field theory, lattice QCD at finite tempera- 
ture and density, chiral gauge theory, quantum statistical systems with fermions. We will 
consider later the Ising models in a complex magnetic field or with complex temperature. 

Many approaches have been proposed for the sign problem but so far none is satis- 
factory. Complex Langevin simulations [2] cannot be shown to converge to the desire 
distributions and often fail to do so. Others [3] are either restricted to too small a lattice, 
too complicated, or not general enough or speculative. 

Following is the crude approach of the average sign [4] , upon which we want to improve 
in the next section. 

If the measure p[x) of a generating function suffers from sign fiuctuation then another 
positive definite function p[x) must be chosen for the MC evaluation of the expectation 
value of an obsevable 0, 

(0) = [{Q{x)p{x)/p{x))p{x)/ [{p{x)/p{x))p{x), 

^ m)im- (1) 

In general it is desirable to choose p independently of 0, and we will concentrate on the 
estimate of the denominator ((f)) because of its appearance in all the measurements. It is 
a simple variational problem to show that the MC probability density, properly normalised, 
which minimises the variance of ((f)) which is cr/ -^^independent configurations, where 

a' = f\p{x)/p{x)-{{l))fp{x), (2) 

must be 

p{x) = \p{x)\ I J \p{x)\ . (3) 

Thus the sign of p[x) is now treated as part of the quantity whose expectation is to be 
measured. 

However, when the denominator ((f)) is vanishingly small, cr^, though minimised, is 
~ I ^ ((1)) since p/\p\ = ±1. Then the evaluation of (0) becomes unreliable unless the 
number of independent configurations is many orders of magnitude greater than the large 
number I/((I))^. The fiuctuation of sign of the measure over configuration space thus 
renders ineffective the sampling guided by this crude MC method (3). This is the content 
of the sign problem. 
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A new approach 



To deal with complex integrands, of which indefinite measures are special cases, we adopt 
the definition (2) of the variance extended to the absolute values of complex numbers. 
Statistical analysis from this definition is the same as the standard analysis; except that 
the range of uncertainty should now be depicted as the radius of an 'uncertainty circle' 
centred on some central value in the complex plane. The variational proof leading to (3) 
also remains intact. 

We can write the generating function as integrals over two configurational subspaces 
fx Pi^) ~ fx Iy p{^7 For example, X and Y are the field values on two non-overlapping 
sublattices. 

We will choose the subspace partition in such a way that the multi-dimensional integral 
over Y can be evaluated analytically or well approximated: 

g{X) = J^p{X,Y). (4) 

In the example of the next section, due to the short range interactions, Y is chosen to be 
the subspace of configurations of non-interacting spins residing on even sites of the lattice, 
and an explicit expression for g can thus be obtained. 

As in the last section, one can easily prove that the MC weight g that minimises 

a" = j \g{X)lg{X)-{m"Q{X). (5) 

•J X 

is 

g{X) = \g{X)\/ j^\g{X)\. (6) 
It then follows that the variance for this new weight is not bigger than that for p{X,Y). 

= j \g{X)ng{X)- [ j \p{XX)? Ip{XX). 

•J X J X JY 

2 / r (■ \ 2 
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the second line follows from (3) and (6); the last line is from (4) and always less than or 
equal to zero because of the triangle inequality. The equality occurs if and only if p{X, Y) is 
semi-definite (either positive or negative) for all X-configuration. In particular, when there 
is no sign problem in the first place, expression (5) yields the same statistical deviation as 
the crude one. 

Our approach is now clear. The measures are first summed over a certain subspace, 
the integration (4) above, to facilitate some partial phase cancellation. Absolute values 
of these sums are then employed as the MC sampling weights (6). The numerator in (1) 
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can be obtained from an appropriate derivative of the generating function after the partial 
summation. 

The choice for sphtting of the integration domain is arbitrary and its effectiveness 
depends on the physics of the problem. The better the phase cancellation in the partial 
sum is, the more important the new MC sampling. One of the most convenient choices 
is the subspace of configurations over some maximal sublattice so that the partial sum 
can be exactly evaluated or well approximated. In particular, if the interactions are short- 
range (not necessary nearest-neighbour), maximal, non-interacting sublattices can always 
be chosen to provide a natural splitting. 

Illustrative examples 

To illustrate our method, we study the one-dimensional Ising model with complex activity 
or complex temperature. The model is simple, exactly soluble and yet sufficient for our 
purposes to show the improvements over the crude method and the workability of the new 
approach when the crude one fails. 

Although we are not interested in the models per se, they are of much physical rele- 
vance. Information about the phase transitions for physical values of parameters in the 
thermodynamic limits can be learned from the finite-volume partition functions in the 
complex plane. The Yang-Lee edge singularity [5], the distribution of Fisher zeros of the 
partition function, and hence their analyticity, in the complex temperature plane [6] have 
been studied. Furthermore, the I-D models of complex temperature can also be given the 
physical interpretation of a two-state quantum tunneling system in real time [7]. 

Owing to the nearest-neighbour interactions, the lattice can be partitioned into odd 
and even sublattices, of which the Ising spins Si (= ±1) on site z(g the sublattice) do not 
interact with each other. Absolute values of sums of the complex- valued weights over the 
even sublattice, say, are the new Monte Carlo weights. 

In our simulations, periodic boundary condition is imposed on the chains of 128 Ising 
spins which become 64 spins after the partial summation. Fnsemble averages are taken 
over 1000 configurations, out of 2^^® possible configurations. They are separated by 30 
heat bath sweeps which is sufficient for thermalisation and decorrelation in all cases except 
perhaps one, as will be demonstrated shortly. A heat bath sweep is defined to be one run 
over the chain, covering each spin in turn. The numbers of trials per sweep are different 
before and after the partial summation because the numbers of spins to be updated are 
not the same. Both hot and cold initial configurations are used and will be indicated when 
needed. 

The results for real coupling J and purely imaginary magnetic field H are summarised 
in Table I. They are evaluated with the crude probability density 
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Table 1: Purely imaginary magnetic field. The absolute statistical deviations are in square 
brackets. 



Coupling 


Magnetic Field 




Denominator 


Relative deviation 


0.1 


O.lz 


improved 


(0.6242,0.0010) [0.0247] 


3.96% 






crude 


(0.4330,0.0295) [0.0285] 


6.57% 


0.01 


O.lz 


improved 


(0.7187,0.0241) [0.020] 


3.06% 






crude 


(0.5088,0.0125) [0.0272] 


5.35% 


0.01 


0.3z 


improved 


(0.0237,0.0120) [0.0316] 


118.81% 






crude 


(-0.0071,0.0163) [0.0316] 


178.04% 



Table 2: Complex coupling and no external field. 



Coupling 




Denominator 


Percentage error 


(0.5,0.1) 


improved 
crude 


(0.7720,-0.3107) [0.0175] 
(0.5719,-0.2231) [0.0250] 


2.11% 
4.07% 


(0.1,0.1) 


improved 
crude 


(0.2804,0.9464) [0.0051] 
(0.1387,0.4787) [0.0274] 


0.51% 
5.50% 


(0.01,0.1) 


improved 
crude 


(0.9916,0.1284) [0.0005] 
(0.5154,0.0533) [0.0270] 


0.05% 
5.22% 


(0.01,0.5) 


improved 
crude 


(0.7620,0.6361) [0.0039] 
(-0.0155,0.0129) [0.0316] 
(-0.0292,0.0323) [0.0316] 


0.39% 
156.65% 
72.65% 



and also with the improved density 

KM) ~ n \cosh{J{s, + s,+,) + H)\. (9) 
odd sites 

The denominator measured is proportional to the partition function with different propor- 
tionality constants for different MC weights. Thus the relevant quantity here is the relative 
deviation defined as the ratio of the deviations in square brackets by the magnitudes of 
central values. Autocorrelation of the denominator as a function of number of sweeps is 
shown in Fig. 1 for both simulations at particular parameter values. 

Table 2 contains the results for complex coupling/temperature with no external field. 
The improved MC weights are as in (9) but with H = and complex J; and the crude 
weights as in (8) with J replaced by its real part. 

The use of table look up, which can be employed in discrete spin models, helps reducing 
the computing time of more complicated improved weights to approximately the same 
amount of time in dealing with crude weights. The improved MC offers consistently smaller 
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Figure 1: Real parts of the autocorrelation of ((1)) for J = 0.5 and H = O.lz from a total 
of 2 X 10^ configurations. The boxes are from crude weights; triangles, improved weights. 



5 



relative deviations. Apart form this, lurther gains are obtained over the crude method: 
only half of the spins, those residing on the odd sites of the original lattice, now needed to 
be considered; less number of sweeps, whose computing time again depends on the number 
of active spins, required for thermalisation and decorrelation; and acceptance rates are also 
slightly lower. 

If the gains in the example of purely imaginary field are about one order of magnitude 
or less (typically a factor of 6), they are much more significant in the complex temperature 
illustration. 

When the real part of the complex temperature is reduced relative to its imaginary 
part, the crude MC behaves worse as expected because of the increase in fiuctuation of the 
sign. The gain in relative deviations of the improved over the crude weights is 100 times 
when J = (0.01,0.1), for example. This translates into a factor of 10"* in configuration 
number if the error is inversely proprotional to ^J ^configurations. At the coupling value 
J = (0.01,0.5), in particular, the crude MC for both cold and hot starts behaves so badly. 
But the improved MC continues to work very well. It gives a definite non-zero value; while 
within the statistically errors given by the former, this value of J could have been taken 
as a zero of the partition function. The autocorrelation in Fig. 2 shows that the noise 
is too overwhelming in the crude simulation to tell whether 30 sweeps are sufficient for 
thermalisation or not. 

Of all the coupling values presented in Table 2, the last one also corresponds to the 
smallest value of the real part of the inverse of correlation length, Re((^~^) = 0.6, which is 
evaluated from the two eigenvalues of the transfer matrix [8]. From these eigenvalues we 
found that the absolute value of the partition function decreases with the couplings in the 
order presented [9]. And this is also the reason why the sign problem is getting worse for 
the crude weight. 

We have presented some measurements in Table 3 evaluated by exact, improved MC 
and crude MC methods respectively. Fxpressions for the magnetisation and susceptibility 
are obtained from appropriate derivatives of functions of corresponding partition functions. 
With the improved weights, we have 




(10) 



and 
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Table 3: Complex coupling. 



Coupling 




Magnetisation per spin 


Susceptibility per spin 


(0.1,0.1) 


exact 
improved 
crude 


(0,0) 

(0.0007,0.0000) [0.0024] 
(-0.0032,0.0016) [0.0062] 


(1.1971,0.2427) 
(1.1921,0.2459) [0.0356] 
(1.1651,0.2883) [0.1460] 


(0.01,0.1) 


exact 
improved 
crude 


(0,0) 

(-0.0006,-0.0001) [0.0021] 
(0.0023,-0.0080) [0.0055] 


(0.9999,0.2027) 
(1.0481,0.2231) [0.0268] 
(1.0868,0.1374) [0.1198] 


(0.01,0.5) 


exact 
improved 
crude 


(0,0) 

(-0.0031,-0.0047) [0.0027] 
(-0.0522,0.0820) [0.2074] 
(0.0123,-0.0723) [0.0830] 


(0.5512,0.8585) 
(0.5604,0.8163) [0.0428] 
(0.6808,5.4507) [9.2788] 
(1.5323,1.3885) [2.3122] 



where Nc is the number of configurations and 

ms}) = n rti'^i^'+^'^^ll, - (12) 

odd sites + 

We have also numerically summed one more spin on the remaining odd sublattice to 
further improve the improved MC weight, but there is no significant gain over the results 
presented above. 



Concluding remarks 

We have presented a method towards a solution for the sign problem. Owing to the sign 
cancellation in the partial sums, our approach can offer substantial improvements over 
the crude average sign method and may work even when the later fails, in the region 
of long correlation length and vanishing partition function. A particular splitting for the 
summation is chosen for our illustrative examples. And this is the natural choice of splitting 
which always exists for short-ranged interactions. But other choices of splitting are feasible 
and how effective they are depends on the physics of the problems. The approach of Ref. [7] 
could be considered as a special case with a particular and non-trivial splitting where the 
inner integration, over Y in our notation, was approximated in a certain manner. 

When the quantity to be averaged is not smooth on the length scale of the crude weight 
function, there is an additional source of systematic error in the average sign method. The 
cancellation in the partial sums may reduce this error by reducing the difference in length 
scales of the measured quantities and that of the sampling weights. 

Lastly on a speculative note, the partial sums might also be used as some pre-conditioning 
for the complex Langevin simulation which does not converge to the raw complex measures. 
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